Tripartite Graph Models for Multi Modal Image Retrieval
نویسنده
چکیده
Most of the traditional image retrieval methods use either low level visual features or embedded text for representation and indexing. In recent years, there has been significant interest in combining these two different modalities for effective retrieval. In this paper, we propose a tri-partite graph based representation of the multi model data for image retrieval tasks. Our representation is ideally suited for dynamically changing or evolving data sets, where repeated semantic indexing is practically impossible. An undirected tripartite graph G = (T,V,D,E) has three sets of vertices where, T = {t1, t2 . . . , tn} are text words, V = {v1,v2 . . . ,vm} are visual words and D = {d1,d2 . . . ,di} are images with E = {ed1 t1 , ..,e di tn ,e d1 v1 , ..,e di vm , et1 v1 , ..,e tn vm} as set of edges. Figure 1 pictorially represent the tripartite graph model (TGM) we use.
منابع مشابه
Exploiting Multimedia Content: a Machine Learning Based Approach
This thesis explores use of machine learning for multimedia content management involving single/multiple features, modalities and concepts. We introduce shape based feature for binary patterns and apply it for recognition and retrieval application in single and multiple feature based architecture. The multiple feature based recognition and retrieval frameworks are based on the theory of multipl...
متن کاملLook, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
Textual-visual cross-modal retrieval has been a hot research topic in both computer vision and natural language processing communities. Learning appropriate representations for multi-modal data is crucial for the cross-modal retrieval performance. Unlike existing image-text retrieval approaches that embed image-text pairs as single feature vectors in a common representational space, we propose ...
متن کاملBayesian non-parametrics for multi-modal segmentation
Segmentation is a fundamental and core problem in computer vision research which has applications in many tasks, such as object recognition, content-based image retrieval, and semantic labelling. To partition the data into groups coherent in one or more characteristics such as semantic classes, is often a first step towards understanding the content of data. As information in the real world is ...
متن کاملMMSS : graph-based multi-modal story-oriented video summarization and retrieval
We propose multi-modal story-oriented video summarization (MMSS) which, unlike previous works that use fine-tuned, domain-specific heuristics, provides a domain-independent, graph-based framework. MMSS uncovers correlations between information of different modalities and gives meaningful story-oriented news video summaries. MMSS can also be applied for video retrieval, achieving performance tha...
متن کاملImage Retrieval: Content versus Context
In this paper, we introduce a new approach to image retrieval. This new approach takes the best from two worlds, combines image features (content) and words from collateral text (context) into one semantic space. Our approach uses Latent Semantic Indexing, a method that uses co-occurrence statistics to uncover hidden semantics. This paper shows how this method, that has proven successful in bot...
متن کامل